SURVEY OF ENERGY EFFICIENT HIGH PERFORMANCE
LOW POWER ROUTER FOR NETWORK ON CHIP

M.Deivakani M.E, Associate Professor / ECE Department\*

Dr.D.Shanthi M.E., Ph.D., Professor/ CSE Department\*

#### **ABSTRACT**

The increasing complexity of systems-on-chip (SoCs) pushes researchers to propose efficient networks-on-chip (NoCs). Efficient exploitation of performance and scalability is the main advantage of NoCs. Routers in NoCs multiplex packets onto the network links. An important research topic in router design is the trade-off between area/power and performance. In this paper we survey efficient routers for high-performance NoCs.

**Keywords:** Network on chip, router design, low-power architecture

#### 1. Introduction

With technology scaling, the on-chip interconnect is expected to play an important role in dictating the performance, energy, and fault tolerance of the overall system. The design and analysis of scalable on-chip interconnects, commonly known as Network-on-Chip (NoC), comes with a different flavor because of the area, energy, and reliability constraints of deep sub-micron design.

#### 1.1 Basic Architecture of Router

NoCs can be modeled as a graph $G = \langle R, L \rangle$, where $R$ represents the set of routers and $L$ represents the set of bidirectional communication links between them. Each link enables communication between the two routers it connects.
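As an illustration, the graph model can be sketched in a few lines of Python. The 2D mesh topology, the coordinate naming, and the function names `build_mesh` and `neighbours` are assumptions made for this sketch; the model itself does not prescribe a topology.

```python
from itertools import product


def build_mesh(rows, cols):
    """Return (R, L) for a rows x cols 2D mesh (assumed topology).

    R is a set of router coordinates (x, y). L is a set of bidirectional
    links, each stored as a frozenset holding its two end routers.
    """
    routers = set(product(range(cols), range(rows)))
    links = set()
    for x, y in routers:
        if x + 1 < cols:                      # link to the router on the right
            links.add(frozenset({(x, y), (x + 1, y)}))
        if y + 1 < rows:                      # link to the router above
            links.add(frozenset({(x, y), (x, y + 1)}))
    return routers, links


def neighbours(router, links):
    """Routers that share a link with the given router."""
    return sorted(next(iter(link - {router})) for link in links if router in link)


R, L = build_mesh(3, 3)
print(len(R), len(L))          # number of routers and links
print(neighbours((1, 1), L))   # routers adjacent to the centre router
```

Storing each link as an unordered pair reflects the bidirectional nature of the links in the model.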

<sup>\*</sup> PSNACET, Dindugul

To improve NoC performance, channels can be multiplexed, allowing different flows in the same direction to share the same channel.



#### 1.2 Basic Router Functionalities

#### 1.2.1 Route Processing

Route processing includes routing table construction and maintenance, using routing protocols to learn about and create a view of the network's topology.

#### 1.2.2 Packet Forwarding

Typically, IP packet forwarding involves the following steps:



#### Volume 2, Issue 3

ISSN: 2320-0294

- IP Packet Validation: Before it proceeds with protocol processing, the router should check that the received packet is properly formed for the protocol. This includes checking the version number, checking the header length field, and calculating the header checksum.
- Destination IP Address Parsing and Table Lookup: The router performs a table lookup to determine the next hop and the output port onto which to direct the packet. Depending on the lookup result, the delivery is:
  - A unicast delivery to a single output port, either to the ultimate destination station or to a next-hop router.
  - A multicast delivery to a set of output ports that depends on the router's knowledge of multicast group membership.
- Packet Lifetime Control: The router adjusts the time-to-live (TTL) field in the packet to prevent packets from circulating endlessly throughout the internetwork. A packet being delivered to a local address within the router is acceptable if it has any positive TTL value. A packet that is to be forwarded has its TTL value decremented as appropriate and is then rechecked to determine whether it has any life left before it is actually forwarded. A packet that has exceeded its lifetime is discarded by the router.
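The three forwarding steps listed above can be sketched as follows for a plain IPv4 header. This is a minimal illustration under stated assumptions, not a router implementation: the routing table layout (a dictionary from network prefix to a list of output ports), the longest-prefix-match rule, and the helper names `is_valid`, `lookup`, `forward`, and `make_header` are all hypothetical.

```python
import ipaddress
import struct


def header_checksum(header: bytes) -> int:
    """Ones' complement of the ones' complement sum of the 16-bit header words."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                               # fold carries into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def is_valid(header: bytes) -> bool:
    """Step 1: check version, header length field, and header checksum."""
    if len(header) < 20:
        return False
    version, ihl = header[0] >> 4, header[0] & 0x0F
    if version != 4 or ihl < 5 or len(header) < ihl * 4:
        return False
    return header_checksum(header[:ihl * 4]) == 0    # a correct header sums to zero


def lookup(dst, table):
    """Step 2: longest-prefix match (assumed); returns a list of output ports."""
    matches = [net for net in table if dst in net]
    return table[max(matches, key=lambda net: net.prefixlen)] if matches else []


def forward(header: bytearray, table, local_addrs):
    """Return (action, ports); the header is updated in place when forwarded."""
    if not is_valid(bytes(header)):
        return "drop", []
    hlen = (header[0] & 0x0F) * 4
    dst = ipaddress.ip_address(bytes(header[16:20]))
    ttl = header[8]
    if dst in local_addrs:                           # local delivery needs TTL > 0
        return ("local", []) if ttl > 0 else ("drop", [])
    ttl -= 1                                         # step 3: decrement, then recheck
    if ttl <= 0:
        return "drop", []
    ports = lookup(dst, table)
    if not ports:
        return "drop", []
    header[8] = ttl
    header[10:12] = b"\x00\x00"                      # recompute checksum after TTL change
    header[10:12] = struct.pack("!H", header_checksum(bytes(header[:hlen])))
    return "forward", ports


def make_header(src: str, dst: str, ttl: int) -> bytearray:
    """Build a 20-byte IPv4 header with a correct checksum (test helper)."""
    h = bytearray(struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, ttl, 17, 0,
                              ipaddress.ip_address(src).packed,
                              ipaddress.ip_address(dst).packed))
    h[10:12] = struct.pack("!H", header_checksum(bytes(h)))
    return h


table = {
    ipaddress.ip_network("10.0.0.0/8"): [1],         # unicast: one output port
    ipaddress.ip_network("10.0.1.0/24"): [2],
    ipaddress.ip_network("224.0.0.0/4"): [1, 3],     # multicast: a set of ports
}
local = {ipaddress.ip_address("10.0.0.254")}
print(forward(make_header("10.0.0.1", "10.0.1.7", ttl=2), table, local))
```

A unicast result is a single-element port list and a multicast result is a longer list, which keeps the two delivery types in one code path.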

#### 2. Flexible router

The Flexible Router [1] achieves better performance by providing a way for other buffers to handle requests to a busy buffer. It raises the saturation rate for Hotspot, Uniform, and Nearest-Neighbor traffic patterns, most notably Hotspot, with an 11.4% increase.



Figure 3. Flexible router architecture.

The Flexible router [1] differs from the base router. The modified modules are shown in Figure 3 with dotted frames, and the added signals with dotted arrows. It still has five input and output ports, but the input ports are modified. XY routing is used, which makes the router deadlock-free. By making efficient use of free buffers, the contention problem at the destination is eliminated. The router achieves higher performance for Hotspot (HS), Uniform, and Nearest-Neighbor traffic patterns, increasing the saturation rate by 11.4% for HS. The area overhead is a 17.8% increase in LUTs and an 11.7% increase in FFs.
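The XY routing mentioned above can be illustrated with a short sketch for a 2D mesh. The coordinate convention (x grows to the east, y grows to the north), the port names, and the function names `next_port` and `xy_path` are assumptions of this example and are not taken from [1].

```python
# Offsets for the four directional ports; the fifth port is LOCAL.
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}


def next_port(current, dest):
    """Pick the output port at `current` for a packet headed to `dest`.

    The X offset is fully corrected before any movement in Y.
    """
    (cx, cy), (dx, dy) = current, dest
    if dx != cx:
        return "E" if dx > cx else "W"
    if dy != cy:
        return "N" if dy > cy else "S"
    return "LOCAL"


def xy_path(src, dest):
    """List of routers visited from src to dest under XY routing."""
    path, node = [src], src
    while (port := next_port(node, dest)) != "LOCAL":
        node = (node[0] + STEP[port][0], node[1] + STEP[port][1])
        path.append(node)
    return path


print(xy_path((0, 0), (2, 1)))
```

Because a packet never turns from the Y dimension back into the X dimension, no cyclic dependency between channels can form, which is why this routing scheme is deadlock-free on a mesh.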

#### 3. Router for High-Performance Intrachip Networks

This work [2] has two main objectives. The first is to discuss performance trade-offs for switching modes and physical channel allocation policies. The second is to propose increasing the number of protocol layers addressed by NoC infrastructures.

#### 3.1 Switching Modes in NoCs

Messages are data that have to be sent from a sender to a receiver through a network. Messages can be transformed into packets by encapsulating all or part of each message with network control information. Alternatively, messages can be sent after a connection has been established between the sender and the receiver. This leads to the two basic modes for message transmission in networks: packet switching and circuit switching, respectively.

Both replicated channels and circuit switching achieve latency reduction through congestion reduction. Replicated channels increase router bandwidth, whereas circuit switching coupled with a session protocol layer maximizes physical channel utilization. The method [2] reduces both circuit area and latency, and it is an advantageous alternative to the use of virtual channels.
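The encapsulation step behind packet switching can be sketched as follows. The header fields (`dst`, `seq`, `last`), the payload size, and the function names are hypothetical and chosen only for illustration; they do not describe the packet format of [2].

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dst: int          # control information: destination router (assumed field)
    seq: int          # control information: position within the message
    last: bool        # control information: marks the final packet
    payload: bytes    # part of the original message


def packetize(message: bytes, dst: int, max_payload: int = 4):
    """Split a message into packets, each carrying its own control information."""
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)] or [b""]
    return [Packet(dst, i, i == len(chunks) - 1, chunk)
            for i, chunk in enumerate(chunks)]


def reassemble(packets):
    """Rebuild the message at the receiver, regardless of arrival order."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))


packets = packetize(b"hello noc!", dst=3)
print(len(packets), reassemble(reversed(packets)))
```

In circuit switching, by contrast, the per-packet control fields are unnecessary once the connection has been established, since the path is already fixed.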

#### 4. High Performance Hybrid Two-Layer Router Architecture

In paper [3], a novel micro-architecture for a hybrid two-layer router is proposed. It supports packet-switched communication across both its local and directional ports. It also supports time-multiplexed circuit-switched communication among the multiple IP cores directly connected to it. The two main concerns with NoC designs that are strictly packet-switched are the control and serialization overheads involved in transferring data between IP cores that are placed close to each other in the FPGA. To ensure high throughput between these cores, the authors [3] advocate time-multiplexed circuit-switched connections. In addition, the router preserves the online nature of communication between cores that are farther apart through the packet-switched layer. The area-efficient MoCReS architecture is modified to support both of these layers of operation. The authors [3] also developed a SystemC model of their router, both to verify the design functionally and to vary its specifications and obtain performance results rapidly through simulation. The architecture of the Hybrid Two-Layer Router [3] is shown in Figure 4.



Figure 4. Architecture of Hybrid Two-Layer Router

Packet switching performs online scheduling by dynamically negotiating communication between the cores. Circuit switching, in contrast, offers high-throughput dedicated connections to overcome the performance drawbacks of packet switching. This is achieved by scheduling time-multiplexed communication across the cores. Although this static scheduling requires all communication patterns to be known beforehand, it can provide very high throughput with marginal area overhead (for storing schedules). In this article [3], a modified router architecture is proposed that interfaces multiple IP cores to the router and supports packet switching for inter-router transfers and time-multiplexed circuit switching for IP cores connected to the same router. This approach also eliminates the req/grant protocol latency and the serialization and control overheads for data transfers between cores that are placed close to each other in FPGAs and mapped to the same router. The modified router architecture [3] achieves an average improvement of 20.4% in NoC bandwidth (with a maximum of 24%) compared to a traditional NoC.
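The idea of a static, time-multiplexed schedule between cores attached to the same router can be sketched as a slot table. The table layout, the core names, and the helper functions below are assumptions made for illustration; they are not the schedule format used in [3].

```python
def check_schedule(schedule):
    """Each slot maps source core -> destination core.

    A destination may be driven by at most one source in any slot;
    otherwise two circuits would collide.
    """
    for index, slot in enumerate(schedule):
        destinations = list(slot.values())
        if len(destinations) != len(set(destinations)):
            raise ValueError(f"slot {index}: destination used twice")


def connections_at(schedule, cycle):
    """Circuits that are active in a given clock cycle (the table repeats)."""
    return schedule[cycle % len(schedule)]


def bandwidth_share(schedule, src, dst):
    """Fraction of all slots reserved for the src -> dst connection."""
    hits = sum(1 for slot in schedule if slot.get(src) == dst)
    return hits / len(schedule)


# Hypothetical four-slot schedule, fixed before run time.
SCHEDULE = [
    {"core0": "core1", "core2": "core3"},
    {"core0": "core2", "core1": "core3"},
    {"core0": "core1"},
    {"core3": "core0"},
]

check_schedule(SCHEDULE)
print(connections_at(SCHEDULE, 5))
print(bandwidth_share(SCHEDULE, "core0", "core1"))
```

Because the table is fixed in advance, no request/grant negotiation is needed at run time, at the cost of having to know the communication pattern beforehand.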

#### 5. Fault-Tolerant Network-on-Chip Router

This article [4] reports a comprehensive fault study of a NoC router through a simulation-based method. The evaluation of crosstalk, single-event upset (SEU), and single-event transient fault injections shows that up to 53% of the injected faults cause a system failure. About 45% of them are replaced by new values before turning into errors.



Only 1% of them are treated as latent errors. Routing units and switch components are the two most vulnerable elements with regard to injected transient faults, with failure rates of 60% and 55%, respectively. Faulty operation of the NoC interconnect might affect the functionality of the connected processor elements (PEs). Faults such as crosstalk, electromigration, electromagnetic interference (EMI), alpha particle hits, and cosmic radiation can lead to failure of NoCs.

#### **5.1 Fault injection characteristics**

Fault injection is a popular technique for evaluating the dependability attributes of a system quantitatively. In digital systems, transient faults are a matter of considerable concern compared to permanent ones. Due to technology scaling, the frequency of transient faults is expected to increase in future systems. Injected faults are treated as permanent if their lifetime equals the total simulation period.

#### **5.2 Fault injection results**

Faults are classified into two groups: propagated and non-propagated. Propagated faults are real faults and can be partitioned into two subclasses, latent and active. Propagated faults that have not yet been detected are called latent, whereas detected faults are called active. Once latent faults have been detected, they turn into active ones. They might also be replaced with new values before being detected, in which case they are called overwritten. A system never suffers from overwritten faults unless the new value is also infected. A similar definition may also apply to non-propagated errors. The NoC router [4] is simulated with a clock period of 10 ns. In this paper [4], a synchronous NoC router architecture is introduced that is capable of receiving and transmitting a flit in four clock cycles. The fault-tolerance behaviour of the router has also been studied in order to keep the penalty of redundancy overheads low in future designs. An information redundancy remedy, called CPRS, is evaluated for achieving robustness against SEU faults, and it results in almost a 50% reduction of the routing unit failure rate.
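The latent, active, and overwritten categories can be made concrete with a toy model of a single register. This sketch is an assumption-laden illustration and not the fault injection framework of [4]: an SEU is modeled as a single bit flip, a later read of the register is taken as the moment the fault is detected, and a later write is taken as the fault being overwritten.

```python
def inject_seu(value: int, bit: int, width: int = 32) -> int:
    """Model a single-event upset as one flipped bit in a stored value."""
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def classify(events_after_injection):
    """Classify an injected fault from the accesses that follow it.

    events_after_injection is a list of "read" / "write" strings in time
    order (assumed event model):
      - first access is a read  -> the fault is observed: "active"
      - first access is a write -> the faulty value is replaced: "overwritten"
      - no access at all        -> the fault stays hidden: "latent"
    """
    for event in events_after_injection:
        if event == "read":
            return "active"
        if event == "write":
            return "overwritten"
    return "latent"


print(bin(inject_seu(0b1010, bit=1, width=4)))
print(classify(["write", "read"]), classify(["read"]), classify([]))
```

Running many such injections over randomly chosen bits and time points, and counting the outcomes per class, gives percentages of the kind reported in the study.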

#### 6. A Router Architecture for the MANGO Clockless Network-on-Chip

In article [5], a wormhole NoC router to be used in MANGO (Message-passing Asynchronous Network-on-Chip providing Guaranteed services through OCP interfaces) is proposed. This NoC consists of network adapters (NAs), routers, and links. The implementation in [5] is based on clockless circuit techniques. MANGO is the first clockless NoC to provide connection-oriented guaranteed services (GS) as well as connection-less best-effort (BE) routing. The features, area, and performance of the MANGO router are comparable to those of similar clocked routers. Additional advantages are inherent support for GALS systems and zero dynamic idle power.



Figure 5. The Architecture Of MANGO Router

The MANGO NoC is shown in Figure 5. It consists of network adapters (NAs), routers, and links. Each IP core is connected to the network through an NA, which provides high-level communication services and also performs the synchronization between the clockless network and the clocked IP core. The routers are connected by links in a grid-type structure, which may be homogeneous or heterogeneous. To keep the speed up, long links are implemented as pipelines. Internally, the router consists of a BE router, a GS router, output buffers, and link arbiters. The BE router and the GS router are implemented separately.

#### 6.1 The BE Router

The BE router implements a simple source routing scheme. The header flit is the first flit of a packet. The two most significant bits of the header indicate one of four output ports. A packet is routed to the local port by choosing the direction back to where it came from. The header is then rotated two bits, positioning the header bits for the next hop. With 32-bit flits, a packet can make a total of 15 hops. The length of each packet is variable; the last flit is indicated by a control bit. The architecture of the BE router is shown in Figure 6.



Figure 6. The Architecture of BE router.
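The source routing scheme of the BE router can be illustrated with a small sketch. The two-bit encoding of the ports (0 = N, 1 = E, 2 = S, 3 = W), the left rotation direction, and the function names are assumptions made for this example; only the use of the two most significant bits, the two-bit rotation, the 32-bit header, and the "back where it came from means local" rule come from the description above.

```python
PORTS = ("N", "E", "S", "W")                     # assumed 2-bit port encoding
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}
MASK = 0xFFFFFFFF                                # 32-bit header flit


def make_header(hops):
    """Pack up to 15 hop directions, most significant bits first.

    A final code pointing back towards the last link is appended; the
    last router interprets it as delivery to the local port.
    """
    if not 1 <= len(hops) <= 15:
        raise ValueError("a 32-bit header holds at most 15 hops")
    codes = [PORTS.index(h) for h in hops]
    codes.append(PORTS.index(OPPOSITE[hops[-1]]))
    header = 0
    for i, code in enumerate(codes):
        header |= code << (30 - 2 * i)
    return header


def be_route(header, in_port):
    """Return (output port, rotated header) for one router."""
    direction = PORTS[(header >> 30) & 0b11]     # two most significant bits
    out = "LOCAL" if direction == in_port else direction
    rotated = ((header << 2) | (header >> 30)) & MASK
    return out, rotated


def trace(header):
    """Follow a header from the injecting core to the local delivery."""
    in_port, outputs = "LOCAL", []
    for _ in range(16):
        out, header = be_route(header, in_port)
        outputs.append(out)
        if out == "LOCAL":
            break
        in_port = OPPOSITE[out]                  # arrives on the facing input port
    return outputs


print(trace(make_header(["E", "E", "S"])))
```

A 32-bit header holds sixteen two-bit fields; one of them is consumed by the final local delivery, which is consistent with the limit of 15 hops stated above.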

#### 6.2 The GS Router

The remaining VCs are used to route header-less data streams on statically programmable connections. To make hard service guarantees, GS connections must be logically independent of other network traffic. In MANGO [5], a connection implements a logical point-to-point circuit between two different local ports. The GS router provides non-blocking switching between the input ports and the output buffers. The GS router can therefore be realized purely on the basis of link access arbitration. GS connections are set up by programming them into the GS router via the BE router, as shown in Figure 7.



Figure 7. The BE router is integrated into the GS router, using a subset of the VCs.

The MANGO router [5] is implemented using clockless circuit techniques. It inherently supports a modular, GALS-oriented design flow. It exploits virtual channels to provide connection-oriented service guarantees (GS) as well as connection-less best-effort (BE) routing.

IJESM

#### 7. Energy-Efficient Modular Router

In article [6], a new router architecture called the Row-Column Decoupled Router, suitable for on-chip interconnects, is introduced. This router considers three desirable objectives in exploring the design space: performance, energy, and fault tolerance. This two-stage wormhole-switched NoC router has a number of features that make it distinct from earlier designs:

- First, instead of the larger 5\*5 crossbar traditionally used for 2D mesh networks, it uses two smaller 2\*2 crossbars.
- Second, it uses a path-sensitive buffering scheme, in which the virtual channels are divided into four sets to support dedicated row and column routing in the two crossbars. Together with early ejection, these first two features help reduce contention.
- Third, unlike most earlier designs, the work shows how deterministic (XY) routing, XY-YX routing, and adaptive routing can be supported in this architecture.
- Fourth, because of the modular design, it shows how different types of faults, such as virtual-channel allocator (VA), switch allocator (SA), and crossbar failures, can be handled with graceful degradation, thereby providing better fault tolerance than earlier designs.

#### 8. Conclusion

In this paper we have given a brief survey of various routers: the Flexible router [1], the Router for High-Performance Intrachip Networks [2], the High Performance Hybrid Two-Layer Router Architecture [3], the fault-tolerant Network-on-Chip router [4], the router architecture for the MANGO clockless Network-on-Chip [5], and the Energy-Efficient Modular Router [6]. The design techniques used in these routers are useful for designing high-performance, low-power, efficient, and fault-tolerant Networks-on-Chip.

#### REFERENCES

- [1] Mostafa S. Sayed, Ahmed Shalaby, Mohamed El-Sayed, Victor Goulart, "Flexible router architecture for network-on-chip," *International Journal of Computers and Mathematics with Applications*, 64 (2012), pp. 1301–1310.
- [2] Everton Carara, Ney Calazans, Fernando Moraes, "A New Router Architecture for High-Performance Intrachip Networks," *Journal Integrated Circuits and Systems*, v.3, n.1, 2008, pp. 23-31.
- [3] P. Ezhumalai, C. Arun, S. Manojkumar, P. Sakthive, D. Sridharan, "High performance Hybrid Two Layer Router Architecture for FPGAs Using Network-On-Chip," *(IJCSIS) International Journal of Computer Science and Information Security*, Vol. 7, No. 1, 2010.
- [4] Ashkan Eghbal, Pooria M. Yaghini, H. Pedram, H. R. Zarandi, "Designing fault-tolerant network-on-chip router architecture," *International Journal of Electronics*, 97:10, pp. 1181-1192.
- [5] Tobias Bjerregaard, Jens Sparsø, "A Router Architecture for Connection-Oriented Service Guarantees in the MANGO Clockless Network-on-Chip," *Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE'05)*, 2005, 1530-1591/05.
- [6] Jongman Kim, Chrysostomos Nicopoulos, Dongkook Park, Vijaykrishnan Narayanan, Mazin S. Yousif, Chita R. Das, "A Gracefully Degrading and Energy-Efficient Modular Router Architecture for On-Chip Networks," *Proceedings of the 33rd International Symposium on Computer Architecture (ISCA'06)*, 2006.
- [7] I. Saastamoinen, M. Alho, J. Nurmi, "Buffer implementation for Proteo network-on-chip," in: *Proceeding of the International Symposium on Circuits and Systems, ISCAS'03*, vol. 2, 2003, pp. II-113–II-116.

- [8] H. Jingcao, R. Marculescu, "Application-specific buffer space allocation for networks-on-chip router design," in: *Proceeding of the IEEE/ACM International Conference on Computer Aided Design, ICCAD*, 2004, pp. 354–361.
- [9] T. Bjerregaard, S. Mahadevan, "A survey of research and practices of Network-on-chip," *ACM Computing Surveys*, 38(1), 2006, pp. 1-51.
- [10] T. Bjerregaard, J. Sparsø, "A Router Architecture for Connection-Oriented Service Guarantees in the MANGO Clockless Network-on-Chip," in: *Proceedings of the Design, Automation and Test in Europe, DATE'05*, 2005, pp. 1226-1231.
- [11] H.-S. Wang, L.-S. Peh, S. Malik, "Power-Driven Design of Router Microarchitectures in On-Chip Networks," in: *Proceedings of the 36th MICRO*, November 2003.
- [12] F. Moraes, N. Calazans, A. Mello, L. Möller, L. Ost, "HERMES: an Infrastructure for Low Area Overhead Packet-switching Networks on Chip," *Integration the VLSI Journal*, 38(1), Oct. 2004, pp. 69-93.
- [13] M. Ali, M. Welzl, S. Hessler, S. Hellebrand, "An Efficient Fault-Tolerant Mechanism to Deal with Permanent and Transient Failures in a Network on Chip," *International Journal of High Performance System Architecture*, 1, 2007, pp. 113–123.